Direct Value Learning: a Preference-based Approach to Reinforcement Learning

نویسندگان

  • David Meunier
  • Yutaka Deguchi
  • Riad Akrour
  • Einoshin Suzuki
  • Marc Schoenauer
  • Michele Sebag
چکیده

Learning by imitation, among the most promising techniques for reinforcement learning in complex domains, critically depends on the human designer ability to provide sufficiently many demonstrations of satisfactory quality. The approach presented in this paper, referred to as DIVA (Direct Value Learning for Reinforcement Learning), aims at addressing both above limitations by exploiting simple experiments. The approach stems from a straightforward remark: while it is rather easy to set a robot in a target situation, the quality of its situation will naturally deteriorate upon the action of naive controllers. The demonstration of such naive controllers can thus be used to learn directly a value function, through a preference learning approach. Under some conditions on the transition model, this value function enables to define an optimal controller. The DIVA approach is experimentally demonstrated by teaching a robot to follow another robot. Importantly, the approach does not require any robotic simulator to be available, nor does it require any pattern-recognition primitive (e.g. seeing the other robot) to be provided.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Low-Area/Low-Power CMOS Op-Amps Design Based on Total Optimality Index Using Reinforcement Learning Approach

This paper presents the application of reinforcement learning in automatic analog IC design. In this work, the Multi-Objective approach by Learning Automata is evaluated for accommodating required functionalities and performance specifications considering optimal minimizing of MOSFETs area and power consumption for two famous CMOS op-amps. The results show the ability of the proposed method to ...

متن کامل

Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool of knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share a same dynamic, reward and action space. In other words, the agents are assumed t...

متن کامل

Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach

Wireless Sensor Networks (WSNs) are consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are number of challenges in WSNs because of limitation of battery power, communications, computation and storage space. In the recent years, computational intelligence approaches such as evolutionar...

متن کامل

Preference-Based Policy Iteration: Leveraging Preference Learning for Reinforcement Learning

This paper makes a first step toward the integration of two subfields of machine learning, namely preference learning and reinforcement learning (RL). An important motivation for a “preference-based” approach to reinforcement learning is a possible extension of the type of feedback an agent may learn from. In particular, while conventional RL methods are essentially confined to deal with numeri...

متن کامل

Web pages ranking algorithm based on reinforcement learning and user feedback

The main challenge of a search engine is ranking web documents to provide the best response to a user`s query. Despite the huge number of the extracted results for user`s query, only a small number of the first results are examined by users; therefore, the insertion of the related results in the first ranks is of great importance. In this paper, a ranking algorithm based on the reinforcement le...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012